EPD_Spec: Extended Position Description Specification

Revised: 1995.11.26

Technical contact: sje@mv.mv.com (Steven J. Edwards)

1: Introduction

EPD is "Extended Position Description".  It is a standard for describing 
chess positions along with an extended set of structured attribute 
values using the ASCII character set.  It is intended for data and 
command interchange among chessplaying programs.  It is also intended 
for the representation of portable opening library repositories and for 
problem test suites.

EPD is an open standard and is freely available for use by both research 
and commercial programs without charge.  The only requirement for use is 
that any proposed extensions be coordinated through the technical 
contact given at the start of this document.

A single EPD record uses one text line of variable length composed of 
four data fields followed by zero or more operations.  A text file 
composed exclusively of EPD data records should have a file name with 
the suffix ".epd".


2: History

EPD was created in 1993 and is based in part on the earlier FEN standard 
(Forsyth-Edwards Notation) for representing chess positions.  Compared 
to FEN, EPD has added extensions for use with opening library 
preparation and also for general data and command interchange among 
advanced chess programs.  EPD was developed by John Stanback and Steven 
Edwards; its first implementation was in Stanback's commercial 
chessplaying program Zarkov and its second implementation was in 
Edwards' research chessplaying program Spector.  So many programs have 
since adopted EPD that no one knows the exact sequence thereafter.

EPD is employed for storing test suites for chessplaying programs and 
for recording the results of programs running these test suites.  
Example test suites are available for researchers via anonymous ftp from 
the chess.onenet.net site in the pub/chess/Tests directory.  The ASCII 
text file pub/chess/Tests/Manifest gives descriptions of the contents of 
the various test suite files.

EPD is used to provide a linkage mechanism between chessplaying programs 
and position database programs to support the automated direction of 
analysis generation.


3: EPD tools and applications

To encourage development of EPD capable applications, a free EPD tool 
kit is available for program authors working with the ANSI C language.  
To further encourage usage of EPD, a number of free applications are 
also available.


3.1: The EPD Kit

Work is currently in progress on developing an EPD Kit.  This tool kit 
is a collection of portable ANSI C source code files that provide 
routines to create and manipulate EPD data for arbitrarily complex 
records.  It is designed to handle all common EPD related tasks so as to 
assist chess program developers with EPD implementation.  A secondary 
goal is to ensure that every implementation of EPD processing have the 
same set of operational semantics.

The EPD Kit will be made freely available to all chess software authors 
without charge and can be used in both research and commercial 
applications.  As with EPD itself, the only requirement for use is that 
any proposed extensions be coordinated through the technical contact 
given at the start of this document.


3.2: Argus, the automated tournament referee

Work is currently in progress on developing Argus, an automated 
tournament referee program for computer chess events.  Argus uses IP 
(Internet Protocol) communications to act as a mediator for multiple 
pairs of chessplaying programs and to provide an interactive interface 
for a human tournament supervisor.  Argus uses the EPD Kit along with 
other routines to perform the following tasks:

1) Starting chessplaying programs (IP clients) with proper 
initialization data;

2) Relaying position/move data (using EPD) from each program to its 
opponent;

3) Providing all chess clock data as part of the relay process;

4) Record all games using PGN (Portable Game Notation) to assist in the 
production of the tournament final report;

5) Record all moves and other transmitted data in log files for later 
analysis;

6) Detect and report time forfeit conditions;

7) Mediate draw offers and responses between each pair of opponents;

8) Recognize and handle game termination conditions due to draws, 
resignations, time forfeits, and checkmates;

9) Allow for chessplaying program restart and game resumption as 
directed by the human supervisor;

10) Allow for a second instance of itself to operate in observer mode to 
be ready to take over in case of primary machine failure;

11) Support display of games in progress for the benefit of the human 
supervisor and for the general viewing audience.

In its usual configuration, Argus runs on an IP network that connects it 
with all of the participating machines.  It acts like a Unix style 
server using TCP/IP; the chessplaying programs connect to Argus as 
TCP/IP clients.  Unlike a typical Unix style server, it runs in the 
foreground instead of the background when operated by a human 
supervisor.

One variant mode of operation allows for Argus to be started by the host 
system and run in the background.  This use is intended for events where 
human supervision is not required.  Any operating information usually 
provided manually may instead be supplied by configuration files.

Another variant mode of operation allows for Argus to mediate 
communication between a single pair of chessplaying programs using 
regular (unstructured) bidirectional asynchronous serial communication 
instead of IP.  While less reliable than IP operation, unstructured 
serial communication can be used on common inexpensive hardware 
platforms that lack IP support.  An example would be to use common PC 
machines with each chessplaying program running on a separate machine 
and a third machine running Argus in serial mode.  Each of the two 
machines with chessplaying programs connect to the Argus machine via a 
null modem cable.  Note that the Argus machine needs two free serial 
ports while each of the chessplaying machines needs only a single free 
serial port.
The Argus program will be made freely available to all chess software 
authors without charge and can be used in both research and commercial 
applications.  As with EPD itself, the only requirement for use is that 
any proposed extensions be coordinated through the technical contact 
given at the start of this document.


3.3: Gastric, an EPD based report generator

Work is in progress on Gastric, an application that reads EPD files and 
produces statistical reports.  The main use of Gastric is to assist in 
the process of benchmarking chessplaying program performance on EPD test 
suites.  The resulting reports contain summaries of raw performance, 
identification of solved/missed problems, distribution information for 
node count, time consumption, and other items.  Advanced functions of 
Gastric may be used to produce comparative analysis of different 
programs or different versions of the same program.  Some work is also 
planned to allow Gastric output to be used as feedback into 
self-adjusting chessplaying programs.

The Gastric program will be made freely available to all chess software 
authors without charge and can be used in both research and commercial 
applications.  As with EPD itself, the only requirement for use is that 
any proposed extensions be coordinated through the technical contact 
given at the start of this document.


4: The four EPD data fields

Each EPD record contains four data filed that describe the current 
position.  From left to right starting at the beginning of the record, 
these are the piece placement, the active color, the castling 
availability, and the en passant target square of a position.  These can 
all fit on a single text line in an easily read format.  The length of 
an EPD position description varies somewhat according to the position 
and any associated operations. In some cases, the description could be 
eighty or more characters in length and so may not fit conveniently on 
some displays.  However, most EPD records pass among programs only and 
so are not usually seen by program users.

Note: due to the likelihood of future expansion of EPD, implementors are 
encouraged to have their programs handle EPD text lines of up to 4096 
characters long including the traditional ASCII NUL character as a 
terminator.  This is an increase from the earlier suggestion of a 
maximum length of 1024 characters.  Depending on the host operating 
system, the external representation of EPD records will include one or 
more bytes to indicate the end of a line.  These do not count against 
the length limit as the internal representation of an EPD text record is 
stripped of end of line bytes and instead is terminated by the 
traditional ASCII NUL character.

Each of the four EPD data fields are composed only of non-blank printing 
ASCII characters.  Adjacent data fields are separated by a single ASCII 
space character.


4.1: Piece placement data

The first field represents the placement of the pieces on the board.  
The board contents are specified starting with the eighth rank and 
ending with the first rank.  For each rank, the squares are specified 
from file a to file h.  White pieces are identified by uppercase SAN 
(Standard Algebraic Notation) piece letters ("PNBRQK") and black pieces 
are identified by lowercase SAN piece letters ("pnbrqk").  Empty squares 
are represented by the digits one through eight; the digit used 
represents the count of contiguous empty squares along a rank.  The 
contents of all eight squares on each rank must be specified; therefore, 
the count of piece letters plus the sum of the vacant square counts must 
always equal eight.  The solidus character "/" (forward slash) is used 
to separate data of adjacent ranks.  There is no leading or trailing 
solidus in the piece placement data; hence there are exactly seven of 
solidus characters in the placement field.

The piece placement data for the starting array is:

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR


4.2: Active color

The second field represents the active color.  A lower case "w" is used 
if White is to move; a lower case "b" is used if Black is the active 
player.

The piece placement and active color data for the starting array is:

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w


4.3: Castling availability

The third field represents castling availability.  This indicates 
potential future castling that may or may not be possible at the moment 
due to blocking pieces or enemy attacks.  If there is no castling 
availability for either side, the single character symbol "-" is used.  
Otherwise, a combination of from one to four characters are present.  If 
White has kingside castling availability, the uppercase letter "K" 
appears.  If White has queenside castling availability, the uppercase 
letter "Q" appears.  If Black has kingside castling availability, the 
lowercase letter "k" appears.  If Black has queenside castling 
availability, then the lowercase letter "q" appears.  Those letters 
which appear will be ordered first uppercase before lowercase and second 
kingside before queenside.  There is no white space between the letters.

The piece placement, active color, and castling availability data for 
the starting array is:

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq


4.4: En passant target square

The fourth field is the en passant target square.  If there is no en 
passant target square then the single character symbol "-" appears.  If 
there is an en passant target square then is represented by a lowercase 
file character (one of "abcdefgh") immediately followed by a rank digit.  
Obviously, the rank digit will be "3" following a white pawn double 
advance (Black is the active color) or else be the digit "6" after a 
black pawn double advance (White being the active color).

An en passant target square is given if and only if the last move was a 
pawn advance of two squares.  Therefore, an en passant target square 
field may have a square name even if there is no pawn of the opposing 
side that may immediately execute the en passant capture.

The piece placement, active color, castling availability, and en passant 
target square data for the starting array is:

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -


5: Operations

An EPD operation is composed of an opcode followed by zero or more 
operands and is concluded by a semicolon.

Multiple operations are separated by a single space character.  If there 
is at least one operation present in an EPD line, it is separated from 
the last (fourth) data field by a single space character.


5.1: General format of opcodes and operands

An opcode is an identifier that starts with a letter character and may 
be followed by up to fourteen more characters.  Each additional 
character may be a letter or a digit or the underscore character.  
Traditionally, no uppercase letters are used in opcode names that are to 
be used by more than one program.


An operand is either a set of contiguous non-white space printing 
characters or a string.  A string is a set of contiguous printing 
characters delimited by a quote (ASCII code: 34 decimal, 0x22 
hexadecimal) character at each end.  A string value must have less than 
256 bytes of data.  This count does not include the traditional ASCII 
NUL character terminator.

If at least one operand is present in an operation, there is a single 
space between the opcode and the first operand.  If more than one 
operand is present in an operation, there is a single blank character 
between every two adjacent operands.  If there are no operands, a 
semicolon character is appended to the opcode to mark the end of the 
operation.  If any operands appear, the last operand has an appended 
semicolon that marks the end of the operation.

Any given opcode appears at most once per EPD record.  Multiple 
operations in a single EPD record should appear in ASCII order of their 
opcode names (mnemonics).  However, a program reading EPD records may 
allow for operations not in ASCII order by opcode mnemonics; the 
semantics are the same in either case.

Some opcodes that allow for more than one operand may have special 
ordering requirements for the operands.  For example, the "pv" 
(predicted variation) opcode requires its operands (moves) to appear in 
the order in which they would be played.  Most other opcodes that allow 
for more than one operand should have operands appearing in ASCII order.  
An example of the latter set is the "bm" (best move[s]) opcode; its 
operands are moves that are all immediately playable from the current 
position.


5.2: Operand basetypes

Operand values are represented using a variety of basetypes.


5.2.1:  Identifier basetype

Some opcodes require one of more operands that are identifiers.  An 
identifier is an unquoted sequence of one to fifteen characters.  The 
characters are selected from the upper and lower case letters, the ten 
digits, and the underscore character.  Most identifiers that may appear 
in EPD are taken from predefined sets as explained in the sections 
covering opcode semantics.

Identifiers are most often used to select one value from a list of 
possible values for a general attribute.  They are also used to 
represent PGN tag attributes.


5.2.2:  Chess move basetype

Some opcodes require one or more operands that are chess moves.  These 
moves should be represented using SAN (Standard Algebraic Notation).  If 
a different representation is used, there is no guarantee that the EPD 
will be read correctly during subsequent processing.  In particular, EDN 
(English Descriptive Notation), CCN (Computer Coordinate Notation), and 
LAN (Long Algebraic Notation) are explicitly not supported.

Chess moves are used most often in single operand operations to select 
one move from the available moves.  They are also used in multiple 
operand operations to define a set of moves (all taken from available 
moves) and in multiple operand operations to express a sequence of moves 
(taken from moves available at each point in a forward sequence of 
play).

Note that some chess moves also qualify as identifiers.  However, the 
semantics of a particular opcode dictate the exact basetype 
interpretation of its operands, so there is no ambiguity.


5.2.3:  Integer basetype

Some opcodes require one or more operands that are integers.  Some 
opcodes may require that an integer operand must be within a given 
range; the details are described in the opcode list given below.  A 
negative integer is formed with a hyphen (minus sign) preceding the 
integer digit sequence.  An optional plus sign may be used for 
indicating a non-negative value, but such use is not required and is 
discouraged.  Support for integers in the range -2147483648 to 
2147483647 (32 bit two's complement signed extrema) is required.

Integers are used to represent centipawn scores and also for various 
counts, limits, and totals.


5.2.4:  Floating basetype

Some opcodes require one or more operands that are floating point 
numbers.  Some opcodes may require that a floating point operand must be 
within a given range; the details are described in the opcode list given 
below.  A floating point operand is constructed from an optional sign 
character ("+" or "-"), a digit sequence (with at least one digit), a 
radix point (always "."), and a final digit sequence (with at least one 
digit).  There is currently no provision for scientific representation 
of numeric values.

The floating basetype in not in current use.


5.2.5:  Date basetype

Some opcodes require one or more operands that represent dates.  These 
are given in a special date format composed of ten characters.  The 
first four characters are digits that give the year (0001-9999), the 
fifth character is a period, the sixth and seventh characters are digits 
that give the month number (01-12), the eighth character is a period, 
and the ninth and tenth characters are digits that give the day number 
in the month (01-31).

The date basetype is used to specify date values in timestamps.


5.2.6:  Time of day basetype

Some opcodes require one or more operands that represent a time of day.  
These are given in a special time of day format composed of eight 
characters.  The first two characters are digits that give the hour 
(00-23), the third character is a colon, the fourth and fifth characters 
are digits that give the minute (00-59), the sixth character is a colon, 
and the seventh and eighth characters are digits that give the second 
(00-59).

The time of day basetype is used to specify time of day values in 
timestamps.


5.2.7:  Clock basetype

Some opcodes require one or more operands that represent a total amount 
of time as would be measured by a traditional digital clock.  These are 
given in a special clock format composed of 12 characters.  The first 
three characters are digits giving a count of days (000-999), the fourth 
character is a colon, the fifth and sixth characters are digits giving a 
count of hours (00-23), the seventh character is a colon, the eighth and 
ninth characters are digits giving a count of minutes (00-59), the tenth 
character is a colon, and the eleventh and twelfth characters are digits 
giving a count of seconds (00-59).

The clock basetype is used to specify clock values for chess clock 
information.  It is not used to measure time consumption for a search; 
an integer count of seconds is used instead.


5.3: Opcode mnemonics

An opcode mnemonic used for archival storage and for interprogram 
communication starts with a lower case letter and is composed of only 
lower case letters, digits, and the underscore character (i.e., no upper 
case letters).  Mnemonics are all at least two characters long.

Opcode mnemonics used only by a single program or an experimental suite 
of programs should start with an upper case letter.  This is so they may 
be easily distinguished should they be inadvertently be encountered by 
other programs.  When a such a "private" opcode be demonstrated to be 
widely useful, it should be brought into the official list (appearing 
below) in a lower case form.

If a given program does not recognize a particular opcode, that 
operation is simply ignored; it is not signaled as an error.


6: Opcode list

The opcodes are listed here in ASCII order of their mnemonics.  
Suggestions for new opcodes should be sent to the technical contact 
listed near the start of this document.


6.1: Opcode "acn": analysis count: nodes

The opcode "acn" takes a single non-negative integer operand.  It is 
used to represent the number of nodes examined in an analysis or search.  
Note that the value may be quite large for some extended searches and so 
use of a long (four byte) representation is suggested.


6.2: Opcode "acs": analysis count: seconds

The opcode "acs" takes a single non-negative integer operand.  It is 
used to represent the number of seconds used for an analysis or search.  
Note that the value may be quite large for some extended searches and so 
use of a long (four byte) representation is suggested.  Also note that 
the special clock format is not used for this operand.  Some systems can 
distinguish between elapsed time and processor time; in such cases, the 
processor time should be used as its value is usually more indicative of 
search effort than wall clock time.


6.3: Opcode "am": avoid move(s)

The opcode "am" indicates a set of zero or more moves, all immediately 
playable from the current position, that are to be avoided as a search 
result.  Each operand is a SAN move; they appear in ASCII order.


6.4: Opcode "bm": best move(s)

The opcode "bm" indicates a set of zero or more moves, all immediately 
playable from the current position, that are judged to the best 
available by the EPD writer and so each is allowable as a search result.  
Each operand is a SAN move; they appear in ASCII order.


6.5: Opcode "c0": comment (primary, also "c1" though "c9")

The opcode "c0" (lower case letter "c", digit character zero) indicates 
a top level comment that applies to the given position.  It is the first 
of ten ranked comments, each of which has a mnemonic formed from the 
lower case letter "c" followed by a single decimal digit.  Each of these 
opcodes takes either a single string operand or no operand at all.

This ten member comment family of opcodes is intended for use as 
descriptive commentary for a complete game or game fragment.  The usual 
processing of these opcodes are as follows:

1) At the beginning of a game (or game fragment), a move sequence 
scanning program initializes each element of its set of ten comment 
string registers to be null.

2) As the EPD record for each position in the game is processed, the 
comment operations are interpreted from left to right.  (Actually, all 
operations in an EPD record are interpreted from left to right.)  
Because operations appear in ASCII order according to their opcode 
mnemonics, opcode "c0" (if present) will be handled prior to all other 
opcodes, then opcode "c1" (if present), and so forth until opcode "c9" 
(if present).

3) The processing of opcode "cN" (0 <= N <= 9) involves two steps.  
First, all comment string registers with an index equal to or greater 
than N are set to null.  (This is the set "cN" though "c9".)  Second, 
and only if a string operand is present, the value of the corresponding 
comment string register is set equal to the string operand.


6.6: Opcode "cc": chess clock values

The opcode "cc" is used to indicate the amount of time used for each 
side at the time of the writing of the opcode to the EPD record.  This 
opcode always takes two values.  Both values are in clock format.  The 
first is the amount of time consumed by White and the second is the 
amount of time consumed by Black.  Note that these values are not simple 
integers.  Also, there is no provision for recording at a resolution of 
less than one second.

This opcode is most commonly used by a mediation program as a source of 
impartial time information for a pair of opposing players.


6.7: Opcode "ce": centipawn evaluation

The opcode "ce" indicates the evaluation of the indicated position in 
centipawn units.  It takes a single operand, an optionally signed 
integer that gives an evaluation of the position from the viewpoint of 
the active player; i.e., the player with the move.  Positive values 
indicate a position favorable to the moving player while negative values 
indicate a position favorable to the passive player; i.e., the player 
without the move.  A centipawn evaluation value close to zero indicates 
a neutral positional evaluation.

Values are restricted to integers that are equal to or greater than 
-32768 and 
are less than or equal to 32766.

A value greater than 32000 indicates the availability of a forced mate 
to the active player.  The number of plies until mate is given by 
subtracting the evaluation from the value 32767.  Thus, a winning mate 
in N fullmoves is a mate in ((2 * N) - 1) halfmoves (or ply) and has a 
corresponding centipawn evaluation of (32767 - ((2 * N) - 1)).  For 
example, a mate on the move (mate in one) has a centipawn evaluation of 
32766 while a mate in five has a centipawn evaluation of 32758.

A value less than -32000 indicates the availability of a forced mate to 
the passive player.  The number of plies until mate is given by 
subtracting the evaluation from the value -32767 and then negating the 
result.  Thus, a losing mate in N fullmoves is a mate in (2 * N) 
halfmoves (or ply) and has a corresponding centipawn evaluation of 
(-32767 + (2 * N)).  For example, a mate after the move (losing mate in 
one) has a centipawn evaluation of -32765 while a losing mate in five 
has a centipawn evaluation of -32757.

A value of -32767 indicates that the side to move is checkmated.  A 
value of -32768 indicates an illegal position.  A stalemate position has 
a centipawn evaluation of zero as does a position drawn due to 
insufficient mating material.  Any other position known to be a certain 
forced draw also has a centipawn evaluation of zero.


6.8: Opcode "dm": direct mate fullmove count

The "dm" opcode is used to indicate the number of fullmoves until 
checkmate is to be delivered by the active color for the indicated 
position.  It always takes a single operand which is a positive integer 
giving the fullmove count.  For example, a position known to be a "mate 
in three" would have an operation of "dm 3;" to indicate this.

This opcode is intended for use with problem sets composed of positions 
requiring direct mate answers as solutions.


6.9: Opcode "draw_accept": accept a draw offer

The opcode "draw_accept" is used to indicate that a draw offer made 
after the move that lead to the indicated position is accepted by the 
active player.  This opcode takes no operands.

The "draw_accept" opcode should not appear on the same EPD record as a 
"draw_reject" opcode.


6.10: Opcode "draw_claim": claim a draw

The opcode "draw_claim" is used to indicate claim by the active player 
that a draw exists.  The draw is claimed because of a third time 
repetition or because of the fifty move rule or because of insufficient 
mating material.  A supplied move (see the opcode "sm") is also required 
to appear as part of the same EPD record.  The "draw_claim" opcode takes 
no operands.

The "draw_claim" opcode should not appear on the same EPD record as a 
"draw_offer" opcode.


6.11: Opcode "draw_offer": offer a draw 

The opcode "draw_offer" is used to indicate that a draw is offered by 
the active player.  A supplied move (see the opcode "sm") is also 
required to appear as part of the same EPD record; this move is 
considered played from the indicated position.  The "draw_offer" opcode 
takes no operands.

The "draw_offer" opcode should not appear on the same EPD record as a 
"draw_claim" opcode.


6.12: Opcode "draw_reject": reject a draw offer

The opcode "draw_reject" is used to indicate that a draw offer made 
after the move that lead to the indicated position is rejected by the 
active player.  This opcode takes no operands.

The "draw_reject" opcode should not appear on the same EPD record as a 
"draw_accept" opcode.


6.13: Opcode "eco": _Encyclopedia of Chess Openings_ opening code

The opcode "eco" is used to associate an opening designation from the 
_Encyclopedia of Chess Openings_ taxonomy with the indicated position.  
The opcode takes either a single string operand (the ECO opening name) 
or no operand at all.  If an operand is present, its value is associated 
with an "ECO" string register of the scanning program.  If there is no 
operand, the ECO string register of the scanning program is set to null.

The usage is similar to that of the "ECO" tag pair of the PGN standard.


6.14: Opcode "fmvn": fullmove number

The opcode "fmvn" represents the fullmove number associated with the 
position.  It always takes a single operand that is the positive integer 
value of the move number.  The value of the fullmove number for the 
starting array is one.

This opcode is used to explicitly represent the fullmove number in EPD 
that is present by default in FEN as the sixth field.  Fullmove number 
information is usually omitted from EPD because it does not affect move 
generation (commonly needed for EPD-using tasks) but it does affect game 
notation (commonly needed for FEN-using tasks).  Because of the desire 
for space optimization for large EPD files, fullmove numbers were 
dropped from EPD's parent FEN.  The halfmove clock information was 
similarly dropped.


6.15: Opcode "hmvc": halfmove clock

The opcode "hmvc" represents the halfmove clock associated with the 
position.  The halfmove clock of a position is equal to the number of 
plies since the last pawn move or capture.  This information is used to 
implement the fifty move draw rule.  It always takes a single operand 
that is the non-negative integer value of the halfmove clock.  The value 
of the halfmove clock for the starting array is zero.

This opcode is used to explicitly represent the halfmove clock in EPD 
that is present by default in FEN as the fifth field.  Halfmove clock 
information is usually omitted from EPD because it does not affect move 
generation (commonly needed for EPD-using tasks) but it does affect game 
termination issues (commonly needed for FEN-using tasks).  Because of 
the desire for space optimization for large EPD files, halfmove clock 
values were dropped from EPD's parent FEN.  The fullmove number 
information was similarly dropped.


6.16: Opcode "id": position identification

The opcode "id" is used to provide a simple identification label for the 
indicated position.  It takes a single string operand.

This opcode is intended for use with test suites used for measuring 
chessplaying program strength.  An example "id" operand for the seven 
hundred fifty seventh position of the one thousand one problems in 
Reinfeld's _1001 Winning Chess Sacrifices and Combinations_ would be 
"WCSAC.0757" while the fifteenth position in the twenty four problem 
Bratko-Kopec test suite would have an "id" operand of "BK.15".


6.17: Opcode "nic": _New In Chess_ opening code

The opcode "nic" is used to associate an opening designation from the 
_New In Chess_ taxonomy with the indicated position.  The opcode takes 
either a single string operand (the NIC code for the opening) or no 
operand at all.  If an operand is present, its value is associated with 
an "NIC" string register of the scanning program.  If there is no 
operand, the NIC string register of the scanning program is set to null.

The usage is similar to that of the "NIC" tag pair of the PGN standard.


6.18: Opcode "noop": no operation

The "noop" opcode is used to indicate no operation.  It takes zero or 
more operands, each of which may be of any type.  The operation involves 
no processing.  It is intended for use by developers for program testing 
purposes.


6.19: Opcode "pm": predicted move

The "pm" opcode is used to provide a single predicted move for the 
indicated position.  It has exactly one operand, a move playable from 
the position.  This move is judged by the EPD writer to represent the 
best move available to the active player.

If a non-empty "pv" (predicted variation) line of play is also present 
in the same EPD record, the first move of the predicted variation is the 
same as the predicted move.

The "pm" opcode is intended for use as a general "display hint" 
mechanism.


6.20: Opcode "ptp": PGN tag pair

The "ptp" opcode is used to record a PGN tag pair.  It always takes an 
even number of operands.  For each pair of operands (from left to 
right), the first operand in the pair is always an identifier and is 
interpreted as the name of a PGN tag; the second operand in the pair is 
always a string and is the value associated with the tag given by the 
first operand.

Any given PGN tag name should only appear once as a tag identifier 
operand in a "ptp" operation.


6.21: Opcode "pv": predicted variation

The "pv" opcode is used to provide a predicted variation for the 
indicated position.  It has zero or more operands which represent a 
sequence of moves playable from the position.  This sequence is judged 
by the EPD writer to represent the best play available.

If a "pm" (predicted move) operation is also present in the same EPD 
record, the predicted move is the same as the first move of the 
predicted variation.


6.22: Opcode "rc": repetition count

The "rc" opcode is used to indicate the number of occurrences of the 
indicated position.  It takes a single, positive integer operand.  Any 
position, including the initial starting position, is considered to have 
an "rc" value of at least one.  A value of three indicates a candidate 
for a draw claim by the position repetition rule.


6.23: Opcode "refcom": referee command

The "refcom" opcode is used to represent a command from a referee 
program to a client program during automated competition.  It takes a 
single identifier operand which is to be interpreted as a command by the 
receiving program.  Note that as the operand is an identifier and not a 
string value, it is not enclosed in quote characters.

There are seven available operand values: conclude, disconnect, execute, 
fault, inform, reset, and respond.

Further details of "refcom" usage are given in the section on referee 
semantics later in this document.


6.24: Opcode "refreq": referee request

The "refreq" opcode is used to represent a request from a client program 
to the referee program during automated competition.  It takes a single 
identifier operand which is to be interpreted as a request to the 
referee from a client program.  Note that as the operand is an 
identifier and not a string value, it is not enclosed in quote 
characters.

There are four available operand values: fault, reply, sign_off, and 
sign_on.

Further details of "refreq" usage are given in the section on referee 
semantics later in this document.


6.25: Opcode "resign": game resignation

The opcode "resign" is used to indicate that the active player has 
resigned the game.  This opcode takes no operands.

The "resign" opcode should not appear on the same EPD record with any of 
the following opcodes: "draw_accept", "draw_claim", "draw_decline', and 
"draw_offer".


6.26: Opcode "sm": supplied move

The "sm" opcode is used to provide a single supplied move for the 
indicated position.  It has exactly one operand, a move playable from 
the position.  This move is the move to be played from the position.

If a "sv" (supplied variation) operation is present on the same record 
and has at least one operand, then its first operand must match the 
single operand of the "sm" opcode.

The "sm" opcode is intended for use to communicate the most recent 
played move in an active game.  It is used to communicate moves between 
programs in automatic play via a network.  This includes correspondence 
play using e-mail and also programs acting as network front ends to 
human players.


6.27: Opcode "sv": supplied variation

The "sv" opcode is used to provide zero or more supplied moves for the 
indicated position.  The operands are a move sequence playable from the 
position.

If an "sm" (supplied move) operation is also present on the same record 
and the "sv" operation has at least one operand, then the "sm" operand 
must match the first operand of the "sv" operation.


6.28: Opcode "tcgs": telecommunication: game selector

The "tcgs" opcode is one of the telecommunication family of opcodes used 
for games conducted via e-mail and similar means.  This opcode takes a 
single operand that is a positive integer.  It is used to select among 
various games in progress between the same sender and receiver.

Details of e-mail implementation await further development.


6.29: Opcode "tcri": telecommunication: receiver identification

The "tcri" opcode is one of the telecommunication family of opcodes used 
for games conducted via e-mail and similar means.  This opcode takes two 
order dependent string operands.  The first operand is the e-mail 
address of the receiver of the EPD record.  The second operand is the 
name of the player (program or human) at the address who is the actual 
receiver of the EPD record.

Details of e-mail implementation await further development.


6.30: Opcode "tcsi": telecommunication: sender identification

The "tcsi" opcode is one of the telecommunication family of opcodes used 
for games conducted via e-mail and similar means.  This opcode takes two 
order dependent string operands.  The first operand is the e-mail 
address of the sender of the EPD record.  The second operand is the name 
of the player (program or human) at the address who is the actual sender 
of the EPD record.

Details of e-mail implementation await further development.


6.31: Opcode "ts": timestamp

The "ts" opcode is used to record a timestamp value.  It takes two 
operands.  The first operand is in date format and the second operand is 
in time of day format. The interpretation of the combined operand values 
gives the time of the last modification of the EPD record.  The 
timestamp is interpreted to be in UTC (Universal Coordinated Time, 
formerly known as GMT).


6.32: Opcode "v0": variation name (primary, also "v1" though "v9")

The opcode "v0" (lower case letter "v", digit character zero) indicates 
a top level variation name that applies to the given position.  It is 
the first of ten ranked variation names, each of which has a mnemonic 
formed from the lower case letter "v" followed by a single decimal 
digit.  Each of these opcodes takes either a single string operand or no 
operand at all.

This ten member variation name family of opcodes is intended for use as 
traditional variation names for a complete game or game fragment.  The 
usual processing of these opcodes are as follows:

1) At the beginning of a game (or game fragment), a move sequence 
scanning program initializes each element of its set of ten variation 
name string registers to be null.

2) As the EPD record for each position in the game is processed, the 
variation name operations are interpreted from left to right.  
(Actually, all operations in an EPD record are interpreted from left to 
right.)  Because operations appear in ASCII order according to their 
opcode mnemonics, opcode "v0" (if present) will be handled prior to all 
other opcodes, then opcode "v1" (if present), and so forth until opcode 
"v9" (if present).

3) The processing of opcode "vN" (0 <= N <= 9) involves two steps.  
First, all variation name string registers with an index equal to or 
greater than N are set to null.  (This is the set "vN" though "v9".)  
Second, and only if a string operand is present, the value of the 
corresponding variation name string register is set equal to the string 
operand.


7: EPD processing verbs

An EPD processing verb is a command to an EPD capable program used to 
direct processing of one or more EPD files.  Standardization of verb 
semantics among EPD capable programs is important to helping reduce 
confusion among program users and to better insure overall 
interoperatibilty.

Each EPD processing verb that requires the reading of EPD records has a 
specific set of required opcodes that must be on each input record.  
Each EPD processing verb that requires the writing of EPD records has a 
specific set of required opcodes that must be on each output record.  
Some EPD processing verbs imply both reading and writing EPD records; 
these will have requirements for both input and output opcode sets.

The names of the EPD processing verbs in this section are for use for 
specification purposes only.  Program authors are free to select 
different names as appropriate for the needs of a program's user 
interface.


7.1: EPD verb: pfdn (process file: data normalization)

The "pfdn" (process file: data normalization) verb reads an EPD input 
file and produces a normalized copy of the data on as the EPD output 
file.  The output file retains the record ordering of the input file.  
The noramlization is used to produce a canonical representation of the 
EPD.  The input records are also checked for legality.  There is no 
minimum set of operations requires on the input records.  For each input 
record, all of the operations present are reproduced in the 
corresponding output record.

The normalization of each EPD record consists of the following actions:

1) Any leading whitespace characters are removed.

2) Any trailing whitespace characters are removed.

3) Any unneeded whitespace characters used as data separators are 
removed; a single blank is used to separate adjacent fields, adjacent 
operations, and adjacent operands.  Also, a single blank character is 
used to separate the fourth position data field (the en passant target 
square indication) from the first operation (if present).

4) Operations are reordered in increasing ASCII order by opcode 
mnemonic.

5) Operands for each opcode that does not require a special order of 
interpretation are reordered in increasing ASCII order by external 
representation.

Data normalization is useful for making a canonical version from data 
produced by programs or other sources that do not completely conform to 
the lexigraphical and ordering rules of the EPD standard.  It also helps 
when comparing two EPD files from different sources on a line by line 
basis; the non-semantic differences are removed so that different text 
lines indicate true semantic difference.


7.2: EPD verb: pfga (process file: general analysis)

The "pfga" (process file: general analysis) verb is used to instruct a 
chessplaying program to perform an analysis for each EPD input record 
and produce an EPD output file containing this analysis.  The output 
file retains the record ordering of the input file.  The current 
position given by each input record is not changed; it is copied to the 
output.

Each input EPD record receives the same analysis effort.  The level of 
effort is indicated as a command (separate from EPD) to the analysis 
program prior to the start of the EPD processing.  Usually, the level is 
given as a time limit or depth limit per each position.  The limit can 
be either a hard limit or a soft limit.  A hard limit represents an 
absolute maximum effort per position, while a soft limit allows the 
program to spend more or less effort per position.  The hard limit 
interpretation is preferred for comparing programs.  The soft limit 
interpretation is used to help test time allocation strategy where a 
program can choose to take more or less time depending on the complexity 
of a position.

Each EPD output record is a copy of the corresponding EPD input record 
with new analysis added as a result of the verb processing.

There is no minimum set of operations required for the EPD input 
records.

Each output EPD record must contain:

1) A "pv" (predicted variation) operation.  The operands of this form a 
sequence of chess moves to be played from the given position.  The 
length of this may vary from record to record due to the level of 
anaylsis effort and the complexity of each position. However, unless the 
current position represents a checkmate or stalemate for the side to 
move, the pv operation must include at least one move.  If the current 
position represents a checkmate or stalemate for the side to move, then 
the pv operation still appears, but has no operands.

2) A "ce" (centipawn evaluation) operation.  The value of its operand is 
the value in hundredths of a pawn of the current position.  Note that 
the evaluation is assigned to the position before the predicted move (or 
any other move) is made.  Thus, a positive centipawn score indicates an 
advantage for the side to move in the current position while a negative 
score indicates a disadvantage for the side to move.

Each output EPD record may also contain:

1) A "pm" (predicted move) operation, unless the current position 
represents a checkmate or stalemate for the side to move.  (If the side 
to move has no moves, then the "pm" operation will not appear.)  The 
single operand of the "pm" opcode must be the same as the first operand 
of the "pv" sequence.

2) A "sm" (supplied move) operation, unless the current position 
represents a checkmate or stalemate for the side to move.  (If the side 
to move has no moves, then the "sm" operation will not appear.)  The 
single operand of the "sm" opcode must be the same as the first operand 
of the "pv" sequence.

3) An "acn" (analysis count: nodes) operation.  The single operand is 
the number of nodes visited in the analysis search for the position.

4) An "acs" (analysis count: seconds) operation.  The single operand is 
the number of seconds used for the analysis search for the position.


7.3: EPD verb: pfms (process file: mate search)

The "pfms" verb is used to conduct searches for forced checkmating 
sequences.  The length of the forced mate sequence is provided (outside 
of EPD) to the program prior to the beginning of "pfms" processing.  The 
length is specified using a fullmove count.  For example, a fullmove 
mate length of three would instruct the program to search for all mates 
in three.  An analysis program reads and input EPD file and looks for 
forced mates in each position where no forced mate of equal or lesser 
length has been recorded.  The output file retains the record ordering 
of the input file.

The action of the "pfms" command on each record is governed by the 
pre-specified fullmove count and, if present on the record, the value of 
the "dm" (direct mate fullmove count) operand.  A particular record will 
be subject to a search for a forced mate if either:

1) There is no "dm" operation on the input record, or

2) The value of the "dm" operand on the input record is greater than the 
value of the pre-specified fullmove analysis length.

If the analysis program finds a forced mate, it produces two additional 
operations on the corresponding output EPD record:

1) A "dm" operation with an operand equal to the pre-specified fullmove 
mate length.

2) A "pm" operation with the first move of the mating sequence as its 
operand.  If two or more such moves exist, the program selects the first 
one it located to appear as the "pm" operand.

The idea is that a set of positions can be repeatedly scanned by a mate 
finding program with the fullmove analysis depth starting with a value 
of one and being increased by one with each pass.  For any given pass, 
the positions solved by an earlier pass are skipped.

The output EPD records may also contain other (optional) information 
such as "acn", "acs", and "pv" operations.


7.4: EPD verb: pfop (process file: operation purge)

The "pfop" verb is used to purge a particular operation from each of the 
records in an EPD file that contain the operation.  The output file 
retains the record ordering of the input file.  Prior to processing, the 
opcode of the operation to be purged is specified.

The records of the input file are copied to the output file.  If the 
pre-specified operation is present on a record, the operation is removed 
prior to copying the record to the output.


7.5: EPD verb: pfts (process file: target search)

The "pfts" (process file: target search) verb is similar to the "pfga" 
(process file: general analysis) verb in that each position on the EPD 
input file is subject to a general analysis.  The difference is that 
each input record contains a set of target moves and a set of avoidance 
moves.  Either of these two sets, but not both, may be empty.  The set 
of avoidance moves is given by the operands of a "am" opcode (if 
present).  The set of target moves is given by the operands of a "bm" 
opcode (if present).

Prior to processing the target search, the program is given a search 
effort limit such as a limit on the amount of search time or search 
nodes per position.  The "pfts" verb causes each input EPD record to be 
read, subjected to analysis, and then written to output file with the 
predicted move attached with the "pm" opcode.  (No "pm" operation is 
added is the current position is a checkmate or stalemate of the side to 
play.)

The output EPD records may also contain other (optional) information 
such as "acn", "acs", and "pv" operations.


8: EPD referee semantics

Communication between a chessplaying program and a referee program is 
performed by exchanging EPD records.  Each EPD record emitted by a 
chessplaying program to be received by the referee has a "refreq" EPD 
opcode with an operand that describes the request.  Each EPD record 
emitted by a referee to be received by a chessplaying program has a 
"refcom" EPD opcode with an operand that describes the command.

The usual operation sequence in a referee mediated event is as follows:

1) The referee server program is started and the human event supervisor 
provides it with any necessary tournament information including the 
names of the chessplaying programs, the name of the event, and various 
other data.

2) The referee program completes its initialization by performing 
pairing operations as required.

3) Once the server has its initial data, it then opens a socket and 
binds it to the appropriate port.  It then starts listening for input 
from clients.  For a serial implementation, an analogous function is 
performed.

4) The competing chessplaying programs (clients) are started (if not 
already running) and are given the name of the referee host machine 
along with the port number.  For a serial implementation, an analogous 
function is performed.

5) Each client program transmits an EPD record to the referee requesting 
registration.  This causes each client to be signed on to the referee.

6) The referee program replies to each client signing on with an EPD 
record commanding a reset operation to set up for a new game.

7) The referee program sends an EPD record to each client informing each 
client about the values for each of the tag values for the PGN Seven Tag 
Format.

8) For each client on the move, the referee will send an EPD record 
commanding a response.  This causes each receiving client to calculate a 
move.  If there has been a prior move, it along with the position from 
which the move is played is sent.  If there has been no prior move, the 
current position is sent but no move is included.

9) For each client receiving a command to respond, the current position 
indicated by the record is set as the current position in the receiving 
program.  (It should already be the current position in the receiver.)  
If a supplied move was given, it is executed on the current position.  
Finally, the receiving program calculates a move.

10) As each program on the move completes its calculation, it sends a 
reply to the referee which includes the result of the calculation.  The 
position sent back on the reply is the result of applying the move 
received on the referee record to the position on the same received 
record.  If a move was produced as the result of the calculation, it is 
also sent.  (A move will not be produced or sent if the receving client 
was checkmated, or if it was stalemated, of if it resigns, or claims a 
draw due to insufficient material.)

11) As the referee receives a reply from a client, it produces a respond 
command record to the client's opponent.  (This step will be skipped if 
an end of game condition is detected and no further moves need to be 
communicated.)

12) The referee continues with the respond/reply cycle for each pair of 
opponent clients until the game concludes for that pair.

13) For each game conclusion, the referee sends a conclude command to 
each of the clients involved.

14) When a client is to be removed from competition, it sends a sign off 
request.  This eliminates that program from being paired until it 
re-registers with a sign on request.

15) When the referree server is to be removed from network operations, 
it will send a disconnect command to each client that is currently 
signed on to the referee.

8.1: Referee commands (client directives)

The referee communicates the command of interest as the single operand 
of the "refcom" opcode.  The refcom opcode will be on each record sent 
by the referee.  Each possible refcom operand is sent as an identifier 
(and not as a string).

EPD records sent by the referee will include check clock data as 
appropriate.  Whenever a client program receives a record with the "cc" 
(chess clock) opcode, the client should set the values of its internal 
clocks to the values specified by the cc operands.  Note that the clock 
values for both White and Black are present in a cc operation.

All EPD records carry the four data fields describing the current 
position.  In most cases, this position should also be the current 
position of the receiving client.  If the position sent by the referee 
matches the client's current position, then the client can assume that 
all of the game history leading to the current position is valid.  Thus, 
every client keeps track of the game history internally and uses this to 
detect repetition draws and so there is no need for each EPD record to 
contain a complete copy of the game history.

If the position sent by the referee does not match the receiving 
program's current position, then the receiving program must set its 
current position to be the same as the one it received.  Unless an 
explicit game history move sequence is also sent on the same EPD record, 
the receiving program is to assume that the new (different) position 
received has no game history.  In this case the receiving program cannot 
check for repetition of positions prior to the new position as there 
aren't any previous positions in the game.

Each client is expected to maintain its own copy of the halfmove clock 
(plies since last irreversible move; starts at zero for the initial 
position) and the fullmove number (which has a value of one for the 
initial position).  If the referee sends a halfmove clock value or a 
fullmove number which is different from that kept by the program, then 
the receiving program is to treat it as a new position and clear any 
game history.  As noted above, a halfmove clock is sent using the "hmvc" 
opcode and a fullmove number is sent using a "fmvn" opcode.

If a supplied move (always using the "sm" opcode) is sent by the 
referee, the receiving program must execute this move on the current 
position.  This is done after the program's current position is set to 
the position sent by the referee (remember that the two will usually 
match).  The resulting position becomes the new current position.  This 
new current position is used for all further calculations.  The new 
current position is also the position to be sent to the referee if a 
move response is commanded.  When a client program produces a move to be 
played, it uses the sm opcode with its operand being the supplied move.  
The position sent is alwasy the position from which the supplied move is 
to be played.  Thus, the semantics of the current position and the 
supplied move are symmetric with respect to the client and the server.


8.1.1: Referee command: conclude

The "conclude" refcom operand instructs the client to conclude the 
current game in progress.  The position sent is the final position of 
the game.  There is no supplied move sent.  No further EPD records 
concerning the game will be sent by the referee.  The client should 
perform any end of game activity required for its normal operation.  No 
response from the client is made.

To allow for client game conclusion processing time, the referee will 
avoid sending any more EPD records to a client concluding a game for a 
time period set by the human supervisor.  The default delay will be five 
seconds.


8.1.2: Referee command: disconnect

The "disconnect" refcom operand instructs the client that the referee is 
terminating service operations.  The client should close its 
communication channel with the server.  This command is sent at the end 
of an event or whenever the referee is to be brought down for some 
reason.  No further EPD records will be sent until the server is cycled.  
It provides an opportunity for a client to gracefully disconnect from 
network operations with the server.  No supplied move is sent.  The 
position sent is irrelevant.  No response from the client is made.


8.1.3: Referee command: execute

The "execute" refcom operand instructs the client to set up a position.  
If a move is supplied (it usually is), then that move is executed from 
the position.  The sent position will usually be the receiver's current 
position.  This command is used only to play through the initial 
sequence of moves from a game to support a restart capability.  No 
response is made by the receiver.


8.1.4: Referee command: fault

The "fault" refcom operand is used to indicate that the referee has 
detected an unrecoverable fault.  The reciever should signal for human 
intervention to assist with corrective action.  The human supervisor 
will be notified by the referee regarding the nature of the fault.  No 
response is made by the receiver.

A future version of the referee protocol will support some form of 
automated fault recovery.


8.1.5: Referee command: inform

The "inform" refcom operand is used to convey PGN tag pair data to the 
receiver.  The "ptp" opcode will carry the PGN tag data to be set on the 
receiving client.  This command may be sent at any time.  It will 
usually be sent prior to the first move of a game.  It will also be sent 
after the last move of a game to communicate the result of the game via 
the PGN "Result" tag pair.  No response is made by the receiver.

The main purpose for the inform referee command is to be able to 
communcate tag pair data to a client without having to send a move or 
other command.  Note that the ptp opcode may also appear on EPD records 
from the referee that are not inform commands; its operands are 
processed in the same way.

The usual information sent includes the values for the Seven Tag Roster.  
The PGN tag names are "Event", "Site", "Date", "Round", "White", 
"Black", and "Result".

Future versions of the referee will likely send more than just the Seven 
Tag Roster of PGN tag pairs.  One probable addition will be to send the 
"TimeControl" tag pair prior to the start of a game; this will allow a 
receiving program to have its time control parameters set automatically 
rather than manually.


8.1.6: Referee command: reset

The "reset" refcom operand is used to command the receiving client to 
set up for a new game.  Any previous information about a game in 
progress is deleted.  This command will be sent to mark the beginning of 
a game.  It will also be sent if there is a need to abort the game 
currently in progress.  No response is made by the receiver.

To allow for client reset processing time, the referee will avoid 
sending any more EPD records to a resetting client for a time period set 
by the human supervisor.  The default delay will be five seconds.


8.1.7: Referee command: respond

The "respond" refcom operand is used to command the receiving client to 
respond to the move (if any) played by its opponent.  The position to 
use for calculation is the position sent which is modified by a supplied 
move (if present; uses the "sm" opcode).  The client program calculates 
a response and sends it to the referee using the "reply" operand of the 
"refreq" opcode.


8.2: Referee requests (server directives)

The referee communicates the command of interest as the single operand 
of the "refcom" opcode.  The refcom opcode will be on each record sent 
by the referee.  Each possible refcom operand is sent as an identifier 
(and not as a string).


8.2.1: Referee request: fault


The "fault" refreq operand is used to indicate that the client has 
detected an unrecoverable fault.  The receiver should signal for human 
intervention to assist with corrective action.  The human supervisor 
will be notified by the referee regarding the nature of the fault.  No 
response is made by the referee.

A future version of the referee protocol will support some form of 
automated fault recovery.


8.2.2: Referee request: reply

The "reply" refreq operand is used to carry a reply by the client 
program.  Usually, a move (the client's reply) is included as the 
operand of the "sm" opcode.


8.2.3: Referee request: sign_off

The "sign_off" refreq operand is used to indicate that the client 
program is signing off from the referee connection and no further 
operations will be made on the communication channel.  The channel in 
use is then closed by both the referee and the client.

A new connection must be established and a new "sign_on" referee request 
needs to be made for further referee operations with the client.


8.2.4: Referee request: sign_on

The "sign_on" refreq operand is used to indicate that the client program 
is signing on to the referee connection.  This request is required 
before any further operations can be made on the communication channel.  
The channel in use remains open until it is closed by either side.


9: EPD report generation semantics

[TBD]


EPD_Spec: EOF